
Effects of Transcription Errors on Supervised Learning in Speech Recognition



Abstract

Supervised learning with Hidden Markov Models has been used for several years to train acoustic models for automatic speech recognition. Typically, clean transcriptions form the basis for this training regimen. However, results have shown that using readily available sources of transcriptions, which can be erroneous at times (e.g., closed captions), does not degrade performance significantly. This work analyzes the effects of mislabeled data on recognition accuracy. For this purpose, training is performed on manually corrupted training data, and the results are observed on three different databases: TIDigits, Alphadigits, and SwitchBoard. For Alphadigits, with 16% of the data mislabeled, the performance of the system degrades by 12% relative to the baseline results. For a complex task like SwitchBoard, at 16% mislabeled training data, the performance of the system degrades by 8.5% relative to the baseline results. The training process is relatively robust to mislabeled data because the Gaussian mixtures that are used to model the underlying distribution tend to cluster around the majority of the correct data. The outliers (incorrect data) do not contribute significantly to the reestimation process.
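The robustness argument in the abstract can be illustrated with a toy sketch. This is not the paper's experimental setup: it assumes a one-dimensional feature, a "correct" class drawn from N(0, 1), and 16% of the training samples mislabeled from a different class drawn from N(4, 1). The function names, the distributions, and the choice of two mixture components are all hypothetical. The sketch fits a two-component Gaussian mixture with plain EM and compares its dominant component with a single Gaussian fitted to the same corrupted data.

```python
import math
import random
import statistics


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def make_corrupted_data(n=2000, mislabeled_fraction=0.16, seed=0):
    """Toy training set for one class (assumed distributions, not from the paper)."""
    rng = random.Random(seed)
    n_bad = round(n * mislabeled_fraction)
    correct = [rng.gauss(0.0, 1.0) for _ in range(n - n_bad)]   # true class
    mislabeled = [rng.gauss(4.0, 1.0) for _ in range(n_bad)]    # wrong class, same label
    return correct + mislabeled


def fit_two_component_gmm(data, iterations=50):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    # Simple initialisation: place the two means at the extremes of the data.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-3)  # floor to avoid a collapsing component
    return weights, means, variances


if __name__ == "__main__":
    data = make_corrupted_data()
    weights, means, _ = fit_two_component_gmm(data)
    dominant = 0 if weights[0] >= weights[1] else 1
    print("single-Gaussian mean:", statistics.fmean(data))
    print("dominant mixture component mean:", means[dominant])
    print("dominant mixture component weight:", weights[dominant])
```

In this toy setting the single Gaussian's mean is pulled toward the mislabeled samples, while the dominant mixture component stays close to the correct data and the smaller component absorbs most of the outliers. Real acoustic models use many more dimensions and components, so this is only a qualitative picture of the effect described above.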
